Automated Word Stability and Language Phylogeny
Authors
Abstract
The idea of measuring distance between languages seems to have its roots in the work of the French explorer Dumont D’Urville (1832). He collected comparative word lists of various languages during his voyages aboard the Astrolabe from 1826 to 1829 and, in his work on the geographical division of the Pacific, proposed a method to measure the degree of relationship among languages. The method used by modern glottochronology, developed by Morris Swadesh in the 1950s (Swadesh, 1952), measures distances from the percentage of shared cognates, which are words with a common historical origin. Recently, we proposed a new automated method that computes the normalized Levenshtein distance between words with the same meaning and averages it over the words contained in a list. Another classical problem in glottochronology is the study of the stability of words corresponding to different meanings. Words evolve through lexical change, borrowing and replacement, at a rate that is not the same for all of them. The speed of lexical evolution differs across meanings and is probably related to the frequency of use of the associated words (Pagel et al., 2007). This problem is tackled here with an automated methodology based only on normalized Levenshtein distance.
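The automated distance described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the Levenshtein distance is normalized, so dividing by the length of the longer word is an assumption here, and the function names and the dictionary layout (meaning mapped to word) are hypothetical choices made for illustration.

```python
from typing import Dict


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Levenshtein distance divided by the longer word's length
    (assumed normalization), giving a value between 0 and 1."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return levenshtein(a, b) / longest


def language_distance(words_a: Dict[str, str], words_b: Dict[str, str]) -> float:
    """Average normalized distance over the meanings present in both lists.

    words_a and words_b map a meaning to the word expressing it
    (hypothetical layout)."""
    shared = [meaning for meaning in words_a if meaning in words_b]
    if not shared:
        raise ValueError("the two word lists share no meanings")
    total = sum(normalized_levenshtein(words_a[m], words_b[m]) for m in shared)
    return total / len(shared)
```

With two toy lists such as `{"water": "water", "night": "night"}` and `{"water": "wasser", "night": "nacht"}`, `language_distance` returns the mean of the two per-meaning distances. Real word lists would normally be transcribed in a common alphabet before comparison.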
Similar resources
Cultural Phylogenetics of the Tupi Language Family in Lowland South America
BACKGROUND Recent advances in automated assessment of basic vocabulary lists allow the construction of linguistic phylogenies useful for tracing dynamics of human population expansions, reconstructing ancestral cultures, and modeling transition rates of cultural traits over time. METHODS Here we investigate the Tupi expansion, a widely-dispersed language family in lowland South America, with ...
Word-Forming Process in Azeri Turkish Language
The subject intended to study the general methods of natural word-forming in Azeri Turkish language. This study aimed to reach this purpose by analyzing the construction of compound Azeri Turkish words. Same’ei (2016) did a comprehensive study on word-forming process in Farsi, which was the inspiration source of this study for Azeri Turkish language word-forming. Numerous scholars had done vari...
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
Schemata-Building Role of Teaching Word History in Developing Reading Comprehension Ability
Methodologically, vocabulary instruction has faced significant ups and downs during the history of language education; sometimes integrated with the other elements of language network, other times tackled as a separate component. Among many variables supposedly affecting vocabulary achievement, the role of teaching word history, as a schemata-building strategy, in developing reading comprehensi...
Complex First? On the Priority of Nouns in Language Acquisition and Evolution
The paper points to an apparent paradox in the science of language. It regards the semantics of nouns and consists of a set of together incompatible, but individually well confirmed propositions about the evolution and development of language, the semantics of word classes and the cortical realization of word meaning. Theoretical and empirical considerations support the view that the concepts e...
Journal title:
- Journal of Quantitative Linguistics
Volume 18, Issue
Pages -
Publication date: 2011